A Fault-Tolerant Clock


The problem: 

In many applications, computers must be fault-tolerant:
they must continue to operate correctly even
though one or more of their components have failed.
Such computers must have, among other things, a
fault-tolerant clock to ensure that all operations occur in the
proper sequence.


The solution: 

An electronic clock has been designed to be
insensitive to the occurrence of faults. It is a substantial
advance over any known electronic clock.


How it’s done: 

Let $A_1$, $A_2$, and $A_3$ be three independent
determinations of the same quantity; then the value of a simple
majority voter function

$$A = (A_1 A_2 + A_1 A_3 + A_2 A_3)$$

will not change if only one $A_i$, say $A_3$, fails, as long as
$A_1 = A_2$. But, without accurate timing it is possible for $A_3$ to
fail and for $A_1$ and $A_2$ to be out of step so that
$A_1 \neq A_2$. In this case $A = A_3$, and the failure is
propagated; since the clock is itself the timing mechanism, the
majority voter function will not ensure fault tolerance.
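The masking and propagation cases described above can be checked with a short sketch. It models each determination as a 0/1 value; the function name `majority` and the helper `follows_third_input` are illustrative choices, not part of the brief.

```python
def majority(a1, a2, a3):
    """Simple majority voter: A = A1*A2 + A1*A3 + A2*A3 (Boolean AND/OR)."""
    return (a1 & a2) | (a1 & a3) | (a2 & a3)


def follows_third_input(a1, a2):
    """True when the voter output equals A3 for both possible values of A3."""
    return all(majority(a1, a2, a3) == a3 for a3 in (0, 1))


# A1 and A2 in step: a failed A3 cannot change the output.
masked = not follows_third_input(1, 1) and not follows_third_input(0, 0)

# A1 and A2 out of step: the output simply follows the failed A3.
propagated = follows_third_input(1, 0) and follows_third_input(0, 1)
```

Both `masked` and `propagated` evaluate to true, which is the weakness the quorum scheme is meant to remove.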

Instead, quorum functions are used. The quorum
function $Q_i^n$ is defined to be logical “1” if at least $i$ of
the variables $A_1, A_2, \ldots, A_n$ are “1”, and logical “0”
otherwise. For example:

$Q_1^4 = A_1 + A_2 + A_3 + A_4$ = “1” when at least one $A_i$ = “1”

$Q_2^4 = A_1A_2 + A_1A_3 + A_1A_4 + A_2A_3 + A_2A_4 + A_3A_4$ = “1”
when at least two $A_i$’s = “1”

$Q_3^4 = A_1A_2A_3 + A_1A_2A_4 + A_1A_3A_4 + A_2A_3A_4$ = “1”
when at least three $A_i$’s = “1”

$Q_4^4 = A_1A_2A_3A_4$ = “1” when all four $A_i$’s = “1”.

A change in the value of $Q_i^n$ is represented by $Q_i^{n+}$ for
a “0” to “1” change and by $Q_i^{n-}$ for a “1” to “0”
change.
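As an illustration of the definition, the sketch below computes a quorum function two ways: by counting the inputs that are “1”, and as the OR of all AND-products of $i$ variables, which is the form of the four-variable examples. The names `quorum` and `quorum_sum_of_products` are assumptions made for this example.

```python
from itertools import combinations, product


def quorum(i, values):
    """Threshold form: 1 if at least i of the 0/1 values are 1, else 0."""
    return 1 if sum(values) >= i else 0


def quorum_sum_of_products(i, values):
    """Same function written as an OR over every AND-product of i variables."""
    return 1 if any(all(term) for term in combinations(values, i)) else 0


# Exhaustive check that the two forms agree for four variables, i = 1..4.
for bits in product((0, 1), repeat=4):
    for i in range(1, 5):
        assert quorum(i, bits) == quorum_sum_of_products(i, bits)
```

The counting form is the more convenient one in software; the sum-of-products form corresponds more directly to a gate-level realization.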

A general fault-tolerant clock can be understood from
the design of a single-fault-tolerant clock with $i$ = 1, 2, 3, or
4 (see figure). The first element generates $Q_2^4$ and $Q_3^4$.
Each $A_i$ is the output of one of four R-S flip-flops. The
events

$$Q_2^{4+},\quad Q_2^{4-},\quad Q_3^{4+},\quad \text{or}\quad Q_3^{4-}$$

may occur. The signals from these events will drive the
differentiators which set and reset each flip-flop
corresponding to an $A_i$ in the following manner:

$Q_2^{4+}$ will set the $A_i$ to logical “1”.

$Q_2^{4-}$ will be delayed by $\Delta T$ and then set the $A_i$ to “1”.

$Q_3^{4-}$ will reset the $A_i$ to logical “0”.

$Q_3^{4+}$ will be delayed by $\Delta T$ and then reset the $A_i$ to “0”.
The normal mode of operation is as follows: 

When two of the four $A_i$’s become “1”, the event $Q_2^{4+}$
occurs.

The event $Q_2^{4+}$ sets the remaining $A_i$’s to “1”.

The setting of the third and fourth $A_i$ to “1” causes
$Q_3^{4+}$ to occur.

The signal from $Q_3^{4+}$ is delayed $\Delta T$ and then resets the $A_i$
to “0”.

When any two $A_i$’s become “0”, $Q_3^{4-}$ occurs and resets the
remaining two $A_i$’s to “0”.

The resetting of the third $A_i$ to “0” causes $Q_2^{4-}$ to
occur.

The signal from $Q_2^{4-}$ is delayed $\Delta T$ and sets the $A_i$ to
“1”.

When two of the four $A_i$’s become “1”, the event $Q_2^{4+}$
occurs, and the cycle repeats.
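The normal cycle can be traced with a small event-driven sketch. It is an idealized model, not the circuit in the figure: each flip-flop is assumed to respond to a set or reset command after its own fixed latency, the delay ΔT and the latencies are arbitrary integer ticks, and an initial set command is assumed in order to start the oscillation. The names `simulate`, `latency`, and `delta_t` are hypothetical.

```python
import heapq


def quorum(i, values):
    """1 if at least i of the 0/1 values are 1, else 0."""
    return 1 if sum(values) >= i else 0


def simulate(latency=(1, 2, 3, 4), delta_t=10, t_end=60):
    """Return the (time, event) edges of the 2-of-4 and 3-of-4 quorum functions."""
    a = [0] * len(latency)   # flip-flop outputs A_i
    q = {2: 0, 3: 0}         # current values of the two quorum functions
    events = []              # heap of (time, flip_flop_index, new_value)

    def command(t, value):
        # Every flip-flop receives the command; each reacts after its latency.
        for k, lat in enumerate(latency):
            heapq.heappush(events, (t + lat, k, value))

    command(0, 1)            # assumed start-up kick
    edges = []
    while events:
        t, k, value = heapq.heappop(events)
        if t > t_end:
            break
        a[k] = value
        for i in (2, 3):
            new = quorum(i, a)
            if new == q[i]:
                continue
            q[i] = new
            name = f"Q{i}{'+' if new else '-'}"
            edges.append((t, name))
            if name == "Q2+":
                command(t, 1)             # set immediately
            elif name == "Q2-":
                command(t + delta_t, 1)   # set after the delay
            elif name == "Q3-":
                command(t, 0)             # reset immediately
            else:                         # "Q3+"
                command(t + delta_t, 0)   # reset after the delay
    return edges


edges = simulate()
```

Under these assumed values the edges repeat in the order Q2+, Q3+, Q3-, Q2-, matching the sequence of steps listed above; changing the latencies shifts the edge times but is not expected to change their order as long as ΔT is large compared with the latencies.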

With a single fault, one $A_i$ is replaced with an
indeterminate quantity. The behavior of the four-variable
quorum function may, in this case, be described
in terms of three-variable functions of the nonfailed
elements.

For instance, the event $Q_2^{4+}$ will occur at $Q_1^{3+}$ (if the
indeterminate $A_i$ happens to be “1”) or at $Q_2^{3+}$ (if the
indeterminate $A_i$ happens to be “0”). In this way,
four- and three-group functions are related as below:

$Q_2^{4+}$ will occur between $Q_1^{3+}$ and $Q_2^{3+}$;

$Q_3^{4+}$ will occur between $Q_2^{3+}$ and $Q_3^{3+}$;

$Q_3^{4-}$ will occur between $Q_3^{3-}$ and $Q_2^{3-}$; and

$Q_2^{4-}$ will occur between $Q_2^{3-}$ and $Q_1^{3-}$.
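The bracketing of the four-variable functions by three-variable functions can be verified exhaustively. The sketch below assumes the failed element may take either value at any time and checks that the four-variable quorum with threshold i always lies between the three-variable quorums of the nonfailed elements with thresholds i and i - 1. The function names are illustrative.

```python
from itertools import product


def quorum(i, values):
    """1 if at least i of the 0/1 values are 1, else 0."""
    return 1 if sum(values) >= i else 0


def bracketed_by_three_variable_functions(i):
    """Check Q_i(3 good) <= Q_i(3 good + failed) <= Q_(i-1)(3 good) in all cases."""
    for good in product((0, 1), repeat=3):
        for failed in (0, 1):
            four = quorum(i, good + (failed,))
            if not (quorum(i, good) <= four <= quorum(i - 1, good)):
                return False
    return True


holds_for_q2 = bracketed_by_three_variable_functions(2)
holds_for_q3 = bracketed_by_three_variable_functions(3)
```

Both checks hold, so each edge of the four-variable functions falls in the window set by the corresponding pair of three-variable edges, whatever the failed element does.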


A cycle of events occurs as in the unfailed case. Since,
however, only three of the $A_i$’s are known, the cycle is
defined in terms of the three-group functions.

The sequence of events is unchanged in the failed
mode because the interval in which $Q_2^4$ is indeterminate
does not overlap the interval in which $Q_3^4$ is
indeterminate. Because the sequence is unchanged, the frequency
is unchanged.

A general fault-tolerant clock, which will tolerate $r$
faults, can be made by using functions $Q_x^n$ and $Q_y^n$ where
$x$ and $y$ are chosen as follows:

$$n \geq 3r + 1,\quad x \geq r + 1,\quad \text{and}\quad y \geq 2r + 1.$$

The modes of operation are essentially the same as in
the single-fault-tolerant clock. A system element can
generate a valid clock signal by a simple majority vote
among any $2r + 1$ of the $3r + 1$ $A_i$’s.
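A brief sketch of the parameter choice follows. It assumes the three conditions are read as "greater than or equal to", which is the reading that reproduces the single-fault clock (four flip-flops, thresholds 2 and 3), and it takes the smallest values allowed. The function names are hypothetical.

```python
def minimal_parameters(r):
    """Smallest n, x, y with n >= 3r + 1, x >= r + 1, y >= 2r + 1 (assumed reading)."""
    if r < 0:
        raise ValueError("r must be non-negative")
    return {"n": 3 * r + 1, "x": r + 1, "y": 2 * r + 1}


def worst_case_good_votes(r):
    """Fewest nonfailed signals among any 2r + 1 signals when at most r have failed."""
    return (2 * r + 1) - r


single_fault = minimal_parameters(1)

# With at most r faults, the r + 1 good signals outvote the rest of the 2r + 1.
for r in range(1, 6):
    assert 2 * worst_case_good_votes(r) > 2 * r + 1
```

For r = 1 this gives n = 4, x = 2, and y = 3, the values used in the single-fault-tolerant design.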